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1. Introduction 

The purpose of this report is to discuss the outer product/tensor product 
and a special case of the tensor product: the Kronecker product, as well as algo- 
rithms, their origin, and optimal implementation when composed, and mapped 
to complex processor/memory hierarchies [l3|, 0, d, H [H, 0, 0, EH, E3]- We 
discuss how the use of A Mathematics of Arrays (Mo A), and the ip - Cal- 
culus, (a calculus of indexing with shapes), provides optimal, verifiable, re- 
producible, scalable, and portable implementations of both hardware and soft- 
ware 0,0,0,0. This is due to the fact that we are using normal forms composed 
of multi-linear operations on Cartesian coordinates which are transformed into 
simple abstract machines: starts, stops, strides, count, up and down the 
processor/memory hierarchy. Before turning the the discussion at hand we in- 
vite the reader to consult the appendix for motivational background illustrating 
how tensor/Kronecker products and diadics arise naturally in applied problems 
in physics and engineering. 

A key notion of the present work is how the MoA outer product can be 
formulated as the Kronecker product, a special case of the Tensor product. We 
will show that the use of the MoA outer product is superior to the traditional 
approach when one is concerned with efficient implementations of multiple Kro- 
necker products. The MoA outer product is a general operation on two arrays 
of any shape or dimension and applies any scalar operation, not just the prod- 
uct (*) on these two arrays (i.e. +, — , /, etc. are valid operations). For now, 
we focus on the relationship to the Kronecker product between two matrices of 
arbitrary size resulting in a block matrix. Let's begin with an example where: 

5 6 7 8 " 
9 10 11 12 , 
13 14 15 16 

the operation: 

A(g)B, 

is defined by the operation in Fig. [TJ 

Note implicitly in the operation above, that the 4 multiplications applied to 
B have a substructure within the resultant array. That is, EACH component of 
A is multiplied with ALL of B creating 4, 3 x 4 arrays. The result is stored in 
a matrix, C, by relating the indices of A, i.e. i,j, with the indices of B, i.e. k,l. 
and encoding them into row, column coordinates. Classically, i,k is correlated 
to a row, and j,l is correlated to a column. More on this later. 

A goal of this paper is to describe how shapes are integral to array/tensor 
operations. By definition, the shape of an array is a vector whose elements 
equal the length of each corresponding dimension of the array. Using shapes, 
we will relate operations in A Mathematics of Arrays (MoA) to tensor algebra 
and we will show how these shapes and the ^-Calculus (also sometimes written: 
Psi-calculus) can be used to compose multiple Kronecker products and map 
such operations to complex processor/memory hierarchies. . 
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Figure 1: Kronecker product of A and B: A matrix. 



2. Shapes and the tp operator 

Let's begin by introducing shapes. The shape of A is 2 by 2, i.e. pA = < 
2 2 >, the shape of B is 3 by 4, i.e. pB = < 3 4 > and the shape of A® B is 6 
by 8, i.e. p{A(£) B) = < 6 8 >. In this discussion we have introduced the shape 
operator, p, which acts on an array and returns its shape vector. 

Now, let's look at the MoA outer product of A and B, denoted by A op x B. 
The shape of A op x B is the concatenation of the shapes of A and B, i.e. a 4- 
dimensional array with shape 2x2x3x4. That is, p (A op x B) = < 2 2 3 4 >. 
The resulting array is indexed by a vector <i j k£> that is ordered in row- major 
order (i.e. in the order of a nested {ijk£} loop with £ the fastest and i the 
slowest increasing partial index). 
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Figure 2: MoA Outer product of A and B: A 4-d array 



The MoA array operation: A op x B is defined by the result in Fig.O 
Notice that the layouts in Figs. [1] and [5] are very similar. What is different 
is the bracketing. The result of the MoA outer product is NOT a matrix but 
is rather a multi-dimensional array. In contrast, the result of the Kronecker 
product IS a matrix (i.e. a two-dimensional array). The extra brackets reflect 
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the fact that the result of the outer product is a 4-dimensional array whose 
shape is obtained by concatenating the shapes of the arguments, i.e. < 2 2 > 
concatenated to < 3 4 > equals < 2 2 3 4 >. So do these arrays have the 
same layout in memory? The answer is no. What is interesting, however, 
is that when the Kronecker product is executed it is filled in, in a row major 
ordering relative to the right argument. The layout, either row or column major, 
would reflect the access patterns needed to optimize these operations across the 
processor/memory hierarchy. Let's assume row major. Thus flattening (i.e. 
creating a vector consisting of the elements of the array in row- major order), 
the difference in layout is as follows: 

(1x5 1x6 1x7 1x8 2x5 2x6 2x7 2x8...) (1) 

Figure 3: Kronecker product flattened using row-major layout 



(1x5 1x6 1x7 1x8 1x9 1x10 lxll 1x12...) (2) 

Figure 4: MoA Outer product flattened using row-major layout 

Before we continue, let's discuss how languages implement these operations. 
Typically, assuming A, B, and C are defined as n by n arrays, the operation: 

A(g)B(g)C 

would materialize all of B (^) C as a temporary array, let's call it TEMP. Then 
it would perform A^ TEMP. If n is large, this could use an enormous amount 
of space. 

Now, let's look at how MoA and ^-Calculus would perform the outer prod- 
uct. Then, we'll discuss how we can restructure the MoA outer product to 
get the Kronecker product and in so doing we'll be able to compose multi- 
ple Kronecker products efficiently and deterministically over complex 
processor /memory hierarchies. 

2.1. Shapes and the Outer product 

Before beginning, we refer the reader to the numerous publications on MoA 
and the V'-Calculus, the most foundational is given in Ref. 0. We thus take 
liberty to use operations in the algebra and calculus by example. Only when 
necessary will a definition be given. 

Definition 1. Assume A, B, C, are nxn arrays, that is, each array has shape: 

pA = pB = pC = < n n > . 

Assume the existence of the ip operator and that it is well defined for n- dimensional 
arrays. The %j) operator takes as left argument an index vector and an array as 
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the right argument and returns the corresponding component of the array. For 
a full index (i.e. as many components are there are dimensions) a scalar is 
returned and for a partial index, a sub-array is selected. Then, 

D = Aop x (Bop x C) 

is defined when the shape of D is equal to the shape of A op x (B op x C). And the 
shape of A op x (B op x C) is equal to the shape of A concatenated to the shape 
of (B op x C) which is equivalent to the shape of A concatenated to the shape of 
B concatenated to the shape of C. i.e. 

pD = p(A op x (B op x C)) = pA-^p(B op x C) = pA^pB^pC = <nnnnnn> 
Then, V i , j , k ,l ,m ,n s.t. 

< io < n; < jo < n; < k < n; < lo < n; < m < n; < n < n 

< io jo k l m n > ip D = (< i jo > ip A) x (< k l m n > tp (B op x C)) 

= (< io jo > ip A) x (< k l > ip B) x (< m n > ip C) 

It is easy to see that we can compose as little or as much as we like given 
the bounds of io, jo, ko, lo, wio and no- We'll return to how to build the above 
composition. We'll also discuss how to include processor memory hierarchies but 
first we'll discuss how to make the layout of the Kronecker product equivalent 
to the layout of the MoA outer product. 

2. 2. Permuting the indices of the MoA outer product 

In order to discuss permuting the outer product we must first discuss how 
to permute an array. One way is through a transpose. We are familiar with 
transposing an array, i.e A T . We know that ^4[j;i] denotes A T [i, j]. Let's now 
discuss how to transpose a matrix in MoA and then how to transpose an array 
in general. 

Definition 2. Given the shape of A is m by n, i.e. pA =< mn >. then A T is 
defined when the shape of A T is n by m. That is, 

pA T =< n m > . 

Then, for all < i < n and < j < m 

<i j > ipA T =< j i > ipA 

Let's now generalize this to any arbitrary array. 

Definition 3. Given the shape of A is<mnopqr>. Then A T is defined 
when the shape of A T is<rqponm> Then for all < io < r; < jo < q; 
< ko < p; < lo < o; < mo < n; < no < m; 

< io jo k lo mo n > ipA T =< n m l k jo io > tpA 
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A question should immediately come to mind. Can the indices permute in other 
ways other than reversing them? The answer is yes, and in fact any permutation 
consistent with the shape of the array is achieved by simply permuting the 
elements of the index vector. Note that the definitions for general transpose 
and grade up presented herein are the same definitions proposed to the F90 
ANSI Standard Committee in 1993 and subsequently accepted for inclusion in 
F95. 

Definition 4. The operator grade up is defined for an n-element vector con- 
taining positive integers in the range from to n — 1 in any order (multiple 
entries of the same integer are allowed). The result is a vector denoting the po- 
sitions of the lowest to the highest such that when the original vector is indexed 
by the result of grade up, the original vector is sorted from lowest to highest. 
Example: Given a = < 2 1 3 >, gradeup[ a ] = gradeup[< 2 1 3 >] = 

< 1 2 3 >. Thus, a[gradup[a\] = a[< 1 2 3 >] =< 2 1 3 > [< 1 2 3 >] = 

< 1 2 3 > . 

To clarify this example we state the operations in words. The O'th element of 
the index vector is 1, implying that the element in position 1 of the vector a, 
i.e. 0, should be placed in the O'th position of the result. The l'st element of 
the index vector, 2, implies that the 2'nd element of a, i.e. 1 should be placed in 
the l'st position of the result and so on. We are now ready to define a general 
transpose for n-dimensional arrays. 

Definition 5. Given an array A with shape s such that the total number of 
components in s denotes the dimensionality d, of A. A T i is defined whenever 
the shape of A T t is s[t\, i.e. pA T t = s[t\. Then, for all <* i <* s[t\ (the 
symbols <* and <* imply element by element comparisons): 

itpA Tr = i[gradeup\t?\]il)A 

Example: Given 
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We first look at A T<2 1 0> and note that this is equivalent to A T . The shape of A 
is < 2 4 3 > so the shape of A T < 2 1 0> is < 2 4 3 > [< 2 1 >] =< 3 4 2 >. Then 
for all 

<* <i j k> <* < 3 4 2 > (this is a shorthand notation for < i < 3; 
< j < 4; < k < 2 ) we have: 

<ijk> ipA T<2 1 0> = (<ijk> [gradeup[< 2 1 >}})ipA 
= {<ijk>[<2l0 >])ipA 
= < k j i > ipA (3) 
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Now let's look at another permutation of A noting there are 6 possible permu- 
tations, i.e. < 1 2 >,< 2 1 >,< 1 2 >,< 1 2 >,< 2 1 >, and 

< 2 1 >. This time let's look at A T<201 >. Now the shape of A T<201 > is 

< 2 4 3 > [< 2 1 >] =< 3 2 4 >. Then for all <* <i j k> <* <3 2 4> 



<ijk> ifiA T<2 *> = 



(< % j k > [gradeup[< 2 1 >}])ijjA 
(<ijk>[<120 >])ipA 
< j k i > i\)A 
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3. Changing Layouts using Permutations 
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Figure 5: Transpose of MoA Outer product of A and B: A 4-d array 

Now that we know how to permute an array over any of it's dimensions we 
can reorient the MoA outer product to have the same layout as the Kronecker 
product or if we desire, we can reorient the Kronecker product to have the same 
layout as the MoA outer product. The pros and cons of each layout will be 
discussed in a later section. 

Recall the layouts of the Kronecker product in Fig. [T] and the MoA outer 
product in Fig. O Let's first permute the MoA outer product such that it has 
the same layout as the Kronecker product, and study the 4-d array defined by 
the MoA outer product in Fig. [2] Now observe the array in Fig. [5] Flattening 
this 4-d array gives us the layout we want. Notice which dimensions changed 
between the initial outer product in Fig. [2] and the transposed outer product in 
Fig. El The shape went from 2x2x3x4 to 2x3x2x4. Reviewing equations 
[T]and[2]we want 1 times 5, 6, 7, and 8 to be next to 2 times 5, 6, 7, and 8, etc. in 
the layout. Thus, we want to leave the 0th dimension alone, the 3rd dimension 
alone and we wanted to permute the 1st dimension with the 2nd. Consequently, 
we want (^4 op x B) T<0 2 1 3> , i.e. the < 2 1 3 > transpose of the outer product 
of A and B. Notice that this is the SAME permutation used in correlating the 
indices of A and B with the indices of the Kronecker product, i.e. resulting 
matrix, i.e. i, j,k,l — > i,k,j,l. 

Recall that this is the same permutation we discussed for the transpose of the 
MoA outer product. We now can discuss how to optimize these computations. 
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Using MoA and tp Calculus, one can not only compose multiple indices in an 
array expression but, the algebraic reformulation of an expression can include 
processor /memory hierarchies. This is done by increasing the dimensions of the 
arguments. Through various restructurings, an expression can easily describe 
how to scale and port across complex processor /memory architectures. 

Unless familiar with the topic, see the Appendix which gives a historical 
perspective of the Kronccker product and illustrates how pervasive the inner and 
outer products are throughout science. That said, an efficient, correct, scalable, 
portable implementation becomes paramount, e.g. accurate simulations and 
reproducible computational experiments rely on this. 

History shows us how the resultant matrix of the Kronecker product is eval- 
uated and indexed. The permutations on the input matrices in conjunction with 
an equivalent permutation on the corresponding shapes followed by a pairwise 
multiplication determines not only the resultant shape but how to store the 
results in its associated index of the resultant array. This cumbersome com- 
putation and encoding into new 2-d indices gets more and more complicated 
as the number of successive Kronecker products increases. Moreover, issues of 
parallelization complicate the problem since various components in the left ar- 
gument are used over the columns of the result, assuming the partitioning was 
done by rows. Other partitions are possible: blocks, columns, etc.. When the 
input matrices are large the problem is further complicated. This is not the 
case in MoA and ip Calculus. 

4. Multiple Kronecker products 

Multiple Kronecker products are common in conjunction with inner products 
and permutations such as transpose. How can these be optimized to use basic 
abstract machine instructions at all levels up and down the processor/memory 
hierarchy: start, stop, stride, count? 

Presently, multiple Kronecker products require the materialization of each 
pair of products. Notice what happens. After each pair of products, the result 
must be stored using the permutations of the indices of the argument arrays and 
encoded into row/column coordinates in a new matrix with size equal to the 
product of the pairs of permuted shapes. For example, if the input arrays were 
2x2 and 3x3. The resultant shape would be a (2 x 3) by (2x3), i.e. 6x6. 
Now, if we then did a Kronecker product with a 2 x 2, the results would be a 12 
x 12. With each subsequent Kronecker product we'd need to store the product 
in the rows and columns associated with the permuted indices. Ideally, we want 
to compose multiple products in terms of their indexing. MoA and ^-calculus 
are ideally suited for this approach and easily facilitate not only the composition 
of multiple Kronecker/outer products but their mapping to complex processor 
memory hierarchies. 

To illustrate, let A be a 2 x 2 and B a 3 x 3 array. We are not concerned 
with the specific values of the matrix elements since we need only to consider 
manipulations of the indices. We assume the arithmetic is correctly defined. 
We'll perform E = (A (g) B) (g) A. The result within the parentheses would 
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have shape 6x6. This was due to the two input array shapes, i.e. 2x2 and 
3x3. Using, i,j in A and k,l in B bounded by their associated shapes, we 
combine i,k with the associated shape from that array, i.e. 2,2 and 3,3 are 
analogously permuted, then multiplied. Thus 2,3 and 2,3 become the new row, 
column associations. These are then multiplied together to become the new 
number of rows and columns, i.e. new shape. The shapes above are used to 
encode the location of each Kronecker product operation. In other words, the 
composite index i, k indexes the rows of (A B) while the composite index j, k 
indexes the columns of (A(g)B). The resultant array, let's call it C, is shown 
in Figure O The input arrays are A and B (see Figures [7] and [^respectively). 

1 2 2 4' 

3 4 5 6 8 10 

6 7 8 12 14 16 

3 6 4 8 

9 12 15 12 16 20 

18 21 24 24 28 32 

Figure 6: 



Now let's perform C A. The result is E, see Figure [9l 
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Figure 8: 



Recall that the result matrix is filled in by 6, 2 x 2 blocks, over the rows 
and columns using the encoding discussed above. Notice how complicated the 
indirect addressing becomes using this approach to implementation of the Kro- 
necker product. Notice also that if we wanted to distribute the computation 
of a block of rows to 4 processors, we'd need multiple components of the left 
argument. 

Let us now look at doing the same operations, i.e. multiple outer products, 
using the Mo A ip calculus approach, C — Aop x B, as seen in Figure [TQl is a 4-d 
array with shape 2x2x3x3. It is easy to see that indexing this array with 
partial indices yields 3x3 sub-arrays. That is, the indices, <00>, <01>, 
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< 1 > and < 1 1 > are used to index C and each sub-array would be sent to 
available processors 0-3, to create a start, stop, stride, mapping suitable for all 
architectures to date. 

Now let's perform C op x A. This would yield a 6-d array with shape 2 x 2 x 
3x3x2x2. We can easily pull apart the arguments in the operations. Let's 
now think of this array asa4x3x3x2x2. We then use the 4 to index the 
processors. We know the blocks have 36 components. 

The following expressions illustrate how easy it is to compose, map, and 
scale to a multi-processor architecture. We first get the shape. 

p((A op B) op A) = (p(A op B) -H-(pA)) 
= ( P A) +^( P B) Vr{ P A) 
= <223322> 
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The indices are composed as follows: Given <* <ij> <* <22>; 
<* <kl> <* <3 3>;and0<* <mn> <* <2 2>and 
for all 0<* <ijklmn> <* <2 2 3 3 2 2>; 

<ijklmn>tp ((A op x B) op x A) = (< i j k I > ip (A op x B)) x (< m n > ipA) 

= (< i j > 4>A) x (< k I > ipB) x (<mn> tpA) 

From here we can easily map chunks to the four processors using starts, stops, 
and strides. 

Let's take the above, referred to as the Denotational Normal Form (DNF) 
expressed in terms of Cartesian coordinates and transform it into its equivalent 
Operational Normal Form, (ONF), expressed in terms of start, stop, stride and 
count. The DNF is independent of layout. The ONF requires one. Let's assume 
row-major. We'll see how natural that is for the Kronecker product at all levels 
of implementation. 

Let's break up the above multiple Kronecker product over 4 processors. We'll 
need to restructure the array's shape <223322>, to<43322>. This 
allows us to index the first dimension of this abstraction over the processors. 
We'll also index the first component of the leftmost argument by this value. 
Notice that the entire right argument is used/accessed in all of the processors. 
Thus, we think of the entire result of both products residing in an array with 
7r<43322> = 144 components (the tt operator gives the product of 
the elements of the vector) laid out contiguously in memory using a row-major 
ordering. 

Thus the equation above becomes for < p < 4 

<ijklmn>ip ((A op x B) op x A) = <pklmn>ip ((a op x B) op x A 

= ((< p > ilia) x (< kl > ij)B)) x (< mn> tpA) 

The expression below describes what each processor, p, will do. a above 
denotes the restructuring of A. avec and bvec are used to describe generic 
implementations. V p,q,r s.t. < p < 4 ; < q < 9; 0<r<4 

( avec [p] x bvec [q ] ) x avec [r] 

We are able to collapse the 2-d indexing for A and B since their access is 
contiguous. This type of thinking and reasoning has been used for over 20 
years[§,0,ll|- 

5. Conclusion 

The purpose of this paper was to illustrate how the Kronecker product/outer 
product is implemented, i.e. the algorithm used to represent the Kronecker/Tensor 
product, can hinder or exploit reasoning of resource management, performance, 
scalability, and portability of the algorithm. The classical way works but is not 
easy to represent, compose, and partition over processor/memory hierarchies. 
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MoA and Psi Calculus provide a way to reason about array based computing. 
By using shapes and the ip function to define a small algebra, higher order op- 
erations can be defined, composed, optimized, and mapped to a simple machine 
abstraction: start, stop, stride, count. 

Moving the theory to implementations that automatically generate correct 
optimal code is the next step. Over 20 years have been spent building prototypes 
to show proof of concept. Serious implementations must be initiated, studied, 
and advanced. 
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A. Motivation for diadics, Kronecker and outer products 

This section provides some simple examples of how dyadics and Kronecker 
products arise naturally in applied problems. 

A.l. Example from engineering 

In the field of electricity and magnetism the following operator arises in the 
wave equation for the electric field: 

VxVxi=-V 2 £ + V(vi). (6) 

For a known source current density J(f, t) (with a known Fourier expansion) it 
is natural to expand the electric field in a Fourier expansion. Thus we are let 
to consider the action of the operator of Eq. [6] on a single Fourier component: 

= E(q,uj)exp(i(q- r- ujt)) (7) 

Action on Eq. with the operator of Eq. [5] gives: 

q 2 E-q{q-E). (8) 

This is simplified by introducing the dyadic (or Kronecker product) : q <g) q, by 
writing: 

q 2 E-q(q-E) = (q 2 I-q®q)-E, (9) 

where I is the unit tensor (matrix) and the dyadic q®q, is defined by its action 
on any other vector u as follows: 

[q®q)-u = q{q-u). (10) 

A convenient interpretation of the dyadic q® q, arises if we work with the unit 
vector q = q/q, where q is the magnitude of the vector q. In terms of the unit 
vector q, Eq. [5J becomes: 

q 2 E-q(q-E) = q 2 (i-q<Z>q)-E, (11) 

and we recognize the operator in parenthesis on the right hand side of Eq. [TT] 
as a projection operator. 

Indeed, the vector (q ® q) ■ E represents the component of E along the 
direction of q and the vector (J — q® q) ■ E represents the component of E 
perpendicular to q. Explicitly we see 

(I - q <g> q) ■ (q O q) = 0. (12) 

This follows from 

(g® q) ■ (?<8 q) = q®q, (13) 
which is a natural consequence of the above definitions. 
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Another contribution to the equation for electromagnetic waves in an anisotropic 
medium is the displacement field D that is related to the electric field E by the 
equation: 

D = e-E, (14) 

where e is the dielectric tensor with diagonal components en and off-diagonal 
components e±. The components e\\ and e± arise from the fact that the re- 
sponse of the medium is different for electric field components parallel to, and 
perpendicular to the direction of wave propagation q, respectively. 

Using the dyadic notation, the dielectric tensor e is conveniently written as 

e = e| | (g <g> q)+e±(I -q®q). (15) 

The complete wave equation, in Fourier space, reads: 

A(q,u)-E(q,w) = b(q,uj) (16) 

where the operator A(q,uj) is the sum of the longitudinal component: 

A\ ] tf,w)= U ' € ^' W \ q<»$), (17) 
and the transverse component: 

A x {q^)={^^-q^{I-q^q). (18) 

The right hand side of the wave equation (Eq. [T6|) is defined in terms of the 
known current density J(q, lo) by: 

btf,v) = ^-Jtf,w) (19) 
cr 

The wave equation (Eq. I16[) is solved through the use of the dyadic Green's 
function G(q, u>) 

E(q, lo) = G{q, lo) ■ b{q, u), (20) 

where G(q, lo) is the inverse of A(q, to) and thus satisfies: 

G{q,w)-A{q,u)=L (21) 

The longitudinal and transverse components of the dyadic Green's function are 
given explicitly by: 

and, 



G±{q,u)) = - ? r-. (23) 



u> 2 e ± (q,Lj) _ ^2 



Thus a complete solution of the original problem is obtained by Fourier trans- 
forming Eq. [201 
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A. 2. Example from linear algebra: matrix decompositions 

Kronecker products (dyadics) can also be conveniently used to express a ma- 
trix expansion. Consider a Hermitian matrix H and its normalized eigenvectors 
Hj (i.e. UiUj = Sij) and eigenvalues Xj satisfying Huj = XjUj. 

A well-known result of linear algebra is that the matrix H can be expressed 
in terms of the following expansion involving Kronecker products: 

H = X-lux <g)Ui,+A 2 U2 ®U2-\ • (24) 

This expansion follows from the fact that the eigenvectors form a complete basis 
and, as such, any arbitrary vector can be expanded as a sum of the eigenvectors 
as: 

v = c x u\ + c 2 U2 + ■■■ (25) 

We see again, the natural interpretation of the Kronecker products as projection 
operators. Each term in the expansion of Eq. [2U gives a non-zero result only 
when acting on the corresponding eigenvector of Eq. [25] The result, XjCjilj, is 
identical to the action of H acting on the corresponding component c\U\ in the 
vector expansion of Eq. 1251 In other words, we find 

(ui <g) Hi) ■ vij = SijUi. (26) 

Thus, this section is consistent with the previous section in terms of the inter- 
pretation of Kronecker products as projection operators. 



A. 3. Higher dimensional generalizations 

We now consider an example from the theory of orthogonal functions (i.e. 
Hilbert space). For this discussion, it is convenient to use Dirac notation. We 
expand a given function |?/>> in a complete set of basis functions \£, m> as: 

\i»=J2 C t,™\t,m>. (27) 

By orthogonality we see that the coefficients can be written in terms of an inner 
product: 

C t , m =<£,m\il», (28) 

which is interpreted as a projection onto the basis vector (function) \£, m>. We 
now wish to expand in another complete basis set \£' ,m' >, perhaps obtained 
from the starting set by rotating the coordinate system 

Yl Bv, m >\i',m'> . (29) 

Cm' 

The coefficients of this expansion are likewise expressed as: 

B t i tm ,=<e,m'\il», (30) 
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and we can relate the coefficients of this later expansion to the coefficients the 
former expansion by taking the inner product of Eq. [29] with the basis function 
\£, m>, to yield: 

<£,m\tp>= < £,m\£',m'>< £',m'\ip >, (31) 
t',m' 

which can more simply be written: 

Ct, m =J2<t, m\f, m' > B t ,, m , (32) 
l',m' 

From equations 1311 and 1321 we see the natural definition of the unit operator: 

I=J2 \£',m'><e',m'\ (33) 

l',m' 

Note the close analogy between this expansion and the expansion of the Her- 
mitian matrix in terms of its eigenvectors in Eq. [M] We see therefore, that a 
higher-dimensional analog of Eq. [M] would be the operator expansion of 

L=J2 x ^\e',m'><£',m'\, (34) 

t',rri 

where \i, m and \£, m > are the eigenvalues and eigenfunctions of the operator 
L, respectively (i.e. L\£,m>— Xi >m \£,m>). 

If we express operators such as Eq. 1341 in matrix form, we are naturally led 
to a higher dimensional generalization of the dyadic u (g> u, namely, the matrix 
product: A® A, where A is a matrix or higher dimensional tensor. Compositions 
of such products, such as A ® B ® C are also similarly defined. 

A. 4- Matrix form 

The matrix representation of the Kronecker Product is 

(A®B)i tJ = AijBt, m (35) 

where I =< i,£ > and J =< j, m > are composite indices that cycle through 
the integers as < i, I > and < j, m > cycle through their allowed values in row 
major order (i.e. £ cycles faster than i and m cycles faster than j), and the 
eigenvector Uj is constructed as a large column vector with as many copies of 
the eigenvector Ui of B as there are columns of A. 
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